ABSTRACT

The LANDSAT multispectral scanner (MSS) data have been analyzed with a 
view toward classification to identify wheat. The notion of spectral signature 
of a crop, a commonly used basis for classification, has been found to be in- 
adequate. Data analysis has revealed that the MSS data from agricultural sites 
are essentially two dimensional, and that the data from different sites and 
different acquisitions lie on parallel planes in the four-dimensional feature 
space. These results have been exploited to gain new insight into the data and 
to develop alternate models for classification. In particular, it has been 
found that the temporal pattern of change in the spectral response of a crop 
constitutes its signature and provides a basis for crop classification. 

1. INTRODUCTION

The classification of multispectral observations from agricultural sites 
is commonly based on the notion of spectral similarity of like ground covers 
in a scene. With this model, the data are characterized on the basis of a 
sample of training fields from each crop class of interest. The data from each 
class are usually assumed to be Gaussian, and the characterization then
consists of computing the sample mean and dispersion matrix. These parameters
are said to constitute the 'spectral signature' of the class and are used as a
basis for classification of the test data [1-3].
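
As an illustration of the characterization just described, the following minimal sketch computes the sample mean vector and dispersion (covariance) matrix of one training field. The function names and the pixel values are hypothetical and chosen only for illustration; the unbiased (n - 1) normalization is an assumption, since the text does not specify one.

```python
def sample_mean(pixels):
    """Channel-wise mean of a list of equal-length pixel vectors."""
    n = len(pixels)
    return [sum(p[k] for p in pixels) / n for k in range(len(pixels[0]))]


def dispersion_matrix(pixels):
    """Sample covariance (dispersion) matrix, normalized by n - 1 (assumed)."""
    n = len(pixels)
    m = sample_mean(pixels)
    d = len(m)
    return [
        [sum((p[i] - m[i]) * (p[j] - m[j]) for p in pixels) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


# Hypothetical four-channel MSS counts for the pixels of one small field.
field = [
    [30, 28, 40, 18],
    [32, 30, 42, 20],
    [31, 26, 44, 19],
    [31, 28, 42, 19],
]

mean = sample_mean(field)        # the field's mean vector
cov = dispersion_matrix(field)   # the field's 4 x 4 dispersion matrix
```

Under the spectral similarity model, the pair (mean, cov) estimated from the training fields of a class would serve as that class's 'spectral signature'.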

Experience with LANDSAT multispectral scanner (MSS) data, however, has 
shown this model generally to be inadequate for crop classification. While the 
within-field variability of data is small, the field-to-field variability is
usually so large as to make the notion of representative fields of a crop class 
untenable. This difficulty is compounded by the lack of wide separability in 
the data from different crop classes. Both these factors depend upon the rela- 
tive biological phases of the different crops in the scene at the time of the 
data acquisition. In the wheat identification problem, for example, it has been
found that in most instances the data from any single acquisition at any time 
during the wheat crop calendar cannot be classified satisfactorily on the basis 
of the spectral similarity model. Indeed, even with multitemporal data (i.e.,
merged data from multiple acquisitions at different times in the crop calendar
of wheat), different sets of training fields produce substantially different
classification results [4].

The difficulty with basing classification on spectral signatures, as described
above, is illustrated in Figures 1a and 1b. These figures provide typical plots
of the mean vectors for several randomly selected wheat and nonwheat fields in
a LANDSAT subframe covering a 5 x 6 nautical mile area in Kansas. The plots
correspond to four acquisitions over the site during different biological 
phases (viz., crop establishment, green, heading, and mature) of the wheat 
crop. The sizes of the fields range from 50 to 100 ground resolution elements
(pixels). The standard deviations in the four channels range from 0.9 to 3.5.
Figure 1a suggests that the data from the wheat fields cannot reasonably be
modeled as having been drawn from the same probability distribution. Indeed,
hypothesis tests for equality of mean vectors across wheat fields invariably
reject equality at each stage of the crop. In a generalization of the model described
above, the data from each crop class are regarded as constituting a Gaussian mix-
ture distribution [5]. This model, though more realistic, is still not entirely
adequate for a situation where training is based on data from a sample of fields.
Quite often the mixture distribution is found to have as many constituents as 
there are fields! The basic difficulty, of course, still is with the notion of 
distinct spectral crop classes and their representation in a sample of training 
fields.

With this background, an extensive analysis of the LANDSAT MSS data was 
undertaken with the objective of discovering features of spectral response that 
constitute a signature of wheat. The data available for this study consisted 
of mean vectors and dispersion matrices for a number of known wheat and non- 
wheat fields from each of several sites with multiple acquisitions. The results 
of data analysis are given in the next section. 


2. SIGNATURE ANALYSIS 

The following findings on the LANDSAT MSS data from agricultural sites 
were reported by the authors in an earlier paper [6]: (1) the data from any
acquisition are essentially two dimensional, and (2) the data from different 
acquisitions/sites essentially lie on parallel two-dimensional planes in the 
four-dimensional feature space. See also the related, independent work of 
Kauth and Thomas [7], who give an interesting phenomenological identification to
the spectral measurements and report roughly similar conclusions on the dimen- 
sionality of the data. 

The above finding on dimensionality offers a significant benefit in terms 
of graphical display of the four-dimensional data. This can be done by finding
a representation of the data in a rotated coordinate frame with, say, the first
two axes on the plane of the data and the remaining two orthogonal to it. In 
this representation, the first two (in-plane) components, giving the location 
of the data on the plane, essentially distill the 'information' from the four- 
channel MSS data; the last two (off-plane) components, measuring the deviation 
of the data from the plane, have only a very small range of values and are re- 
garded as noise. The relative positions of the data in the original four dim- 
ensional feature space are very nearly preserved in a data display based only 
on the first two components of the transformed data. Note that having iden- 
tified the plane, for our purpose, the orientation of the two orthogonal axes 
on it is entirely arbitrary [6]. The coordinate frame, i.e., orthonormal trans-
formation (see Appendix), used in the graphical representation of data in this
paper was chosen solely for clean displays. In plots of in-plane (off-plane) 
components, the first (third) component of the transformed data is plotted 
along the abscissa. 
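
The change of coordinates described above can be sketched as follows. The matrix entries are those listed in the Appendix; arranging them so that each extracted group of four values forms a column of T is an assumption about the original layout, adopted because it makes the first two rows resemble brightness-like and greenness-like axes. The function names and the sample vector are hypothetical.

```python
# Transformation matrix from the Appendix (row/column layout is assumed).
T = [
    [0.406, 0.600, 0.645, 0.243],
    [-0.386, -0.530, 0.535, 0.532],
    [0.723, -0.597, 0.206, -0.278],
    [0.404, -0.039, -0.505, 0.762],
]


def transform(x):
    """Express a four-channel MSS vector in the rotated coordinate frame."""
    return [sum(t * v for t, v in zip(row, x)) for row in T]


def split_components(x):
    """Return ((in-plane pair), (off-plane pair)) of the transformed vector."""
    y = transform(x)
    return (y[0], y[1]), (y[2], y[3])


# Hypothetical four-channel mean vector of a field.
x = [30, 28, 40, 18]
in_plane, off_plane = split_components(x)
```

Because T is orthonormal only to the three decimals printed, the transformed vector preserves the length of the original vector approximately rather than exactly.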

Figures 2a and 2b give scatter plots of data from the Kansas site 
(acquisition 2) mentioned earlier. These data correspond to 22,932 ground 
resolution elements in the scene. The plots use the characters •,1,2,...,9,A,B,...,
Z to represent 36 increasing levels of concentration in a cell. The character
assignment is on a uniform scale in the range 1 through KMAX, specified on the 
plots. The plot in Figure 2a is typical; the data are densely packed in a 
roughly triangular region with no apparent cluster structure. The spectral 
similarity model, however, is predicated on the existence of cluster structure 
in the data. The scatter plot of the off-plane components (Figure 2b) is also
typical; it demonstrates the two-dimensionality of the data.
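
The character coding of the scatter plots can be sketched as below. This is one plausible reading of "a uniform scale in the range 1 through KMAX", not the plotting routine actually used: the range is divided into 36 equal steps, an ASCII period stands in for the first symbol, and empty cells are assumed to print as blanks. The name density_symbol is hypothetical.

```python
import string

# 36 symbols in increasing order of concentration: '.', 1-9, A-Z.
# The period is an ASCII stand-in for the first plotting symbol (assumed).
SYMBOLS = "." + "123456789" + string.ascii_uppercase


def density_symbol(count, kmax):
    """Map a cell count in 1..kmax to one of 36 symbols on a uniform scale."""
    if count <= 0:
        return " "                      # empty cell (assumed to print blank)
    count = min(count, kmax)            # clip counts above KMAX
    n = len(SYMBOLS)
    level = (count * n + kmax - 1) // kmax   # integer ceiling, in 1..36
    return SYMBOLS[level - 1]


# Example with a hypothetical KMAX of 72: two counts share each symbol.
row = "".join(density_symbol(c, 72) for c in (0, 1, 3, 40, 72))
```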

For each of the available acquisitions over several sites, the transformed 
mean vectors of a set of wheat and nonwheat fields were plotted on the plane of 
the data. Two such scatter plots are presented in Figures 3a and 4a. The 
former corresponds to a site in Kansas with registered data available from six 
acquisitions over the crop calendar of wheat. Figure 4a corresponds to a site 
in Oklahoma with eight acquisitions. The acquisition dates are given below 
each plot as five-digit numbers. The first two digits identify the year, and 
the last three the day of that year. These two sites were chosen for avail- 
ability of good-quality data from several acquisitions over each. In view of 
the small within-field variability, the data from all pixels of a field can be
thought of as densely scattered about the mean. The wheat grown at both sites
is of the winter wheat variety. It is planted early in the fall, is dormant during
winter, greens and matures during spring, and is harvested in early summer. 
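
The five-digit acquisition dates described above (two digits for the year, three for the day of that year) can be decoded as in the following sketch. The function name is hypothetical, the example codes are not taken from the figures, and the assumption that the two-digit year belongs to the 1900s is ours.

```python
from datetime import date, timedelta


def parse_acquisition_date(code, century=1900):
    """Decode a 'YYDDD' string into a calendar date (century is assumed)."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError("expected five digits, e.g. '76045'")
    year = century + int(code[:2])
    day_of_year = int(code[2:])
    if not 1 <= day_of_year <= 366:
        raise ValueError("day of year out of range")
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if result.year != year:
        raise ValueError("day 366 given for a non-leap year")
    return result


# Hypothetical acquisition code: day 60 of 1976, a leap year.
example = parse_acquisition_date("76060")
```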

Figures 3a and 4a illustrate the difficulty with the spectral signature 
model for wheat identification. The large field-to-field variability, as noted
earlier, is generally compounded by the lack of strong separability in wheat-
nonwheat data. For example, in Figure 4a, for one-half of the acquisitions no
decision boundary that separates the wheat and nonwheat data from the
training fields is apparent. Even in cases where such decision boundaries can be
drawn, what can be said of classification of the test data? 

Experience with maximum likelihood classification with Gaussian (-mixture) 
model for data from the various crop classes has shown that the decision bound- 
aries determined by the different sets of training fields are substantially 
different. The more training fields there are of wheat, the greater is the 
amount of wheat discovered in the scene by classification of data [4]. The
reason for this is apparent from Figures 3a and 4a. The difficulty is that the
data from a sample of fields of a crop class are not representative of the popu-
lation. A better characterization of the data would be obtained by taking a
pixel, rather than a field, as a unit for training. For example, a 5% sample 
of pixels from the wheat fields in a scene would provide a better basis for 
characterization of the distribution of the wheat data than a set of pixels of 
the same size belonging to a sample of wheat fields. Such a training process,
however, is not deemed cost effective in large-scale utilization of the LANDSAT
data.


This basic difficulty in characterization of the data from different crop 
classes is resolved by examining the pattern of temporal changes in the spectral 
response. Again, the results on the data dimensionality have been crucial to 
this work: The temporal pattern can be plotted as a trajectory on the plane of 
the data by joining together the points representing the locations of the field 
means in successive acquisitions. Figures 3b, 3c, 4b, and 4c present these 
temporal trajectories for some of the training fields from the Kansas and Okla- 
homa sites. An asterisk marks the starting point of each trajectory. The 
scale of each plot, unless otherwise specified, is the same as that of the plot 
above it. 
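
The construction of a temporal trajectory can be sketched as follows: the field mean from each acquisition is projected onto the plane of the data, and the resulting points are joined in chronological order. The two rows used here are the first two rows of the Appendix transformation under the assumed matrix layout; the function names and the field means are hypothetical.

```python
# First two (in-plane) rows of the Appendix transformation (layout assumed).
T_IN_PLANE = [
    [0.406, 0.600, 0.645, 0.243],
    [-0.386, -0.530, 0.535, 0.532],
]


def in_plane(mean_vector):
    """Project a four-channel field mean onto the plane of the data."""
    return tuple(sum(t * v for t, v in zip(row, mean_vector))
                 for row in T_IN_PLANE)


def trajectory(field_means):
    """Order the projected field means by acquisition date.

    field_means maps a 'YYDDD' acquisition code to the field's mean vector.
    Zero-padded codes from one century sort chronologically as strings.
    """
    return [(code, in_plane(field_means[code])) for code in sorted(field_means)]


# Hypothetical field means from three registered acquisitions.
means = {
    "76120": [28, 22, 52, 27],
    "75300": [30, 28, 40, 18],
    "76045": [27, 24, 45, 22],
}
path = trajectory(means)   # the first entry is the starting point
```

Plotting the successive points of path and joining them with line segments gives the kind of trajectory shown in the figures.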

These trajectories constitute a complete graphical description of the 
spectral-temporal response of the field. Note that for each of the two sites, 
the trajectories associated with wheat fields are similar, and are sufficiently 
distinct from the corresponding trajectories of nonwheat. Even for acquisitions 
where the wheat and nonwheat data had appeared confused, the corresponding pat- 
terns of spectral changes bear unique information for classification. Examin- 
ation of multitemporal data from a number of sites has revealed that in each 
case the pattern of temporal changes characterizes the crop and constitutes a 
valid signature. Supervised classification of the data can be based on features 
extracted from the temporal trajectories of the training fields [8].
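
A minimal sketch of classification from trajectory features is given below. It is not the procedure of the cited reference: as an illustrative assumption, the features are the successive in-plane displacements between acquisitions, and a test field takes the label of the training field whose displacement pattern is nearest in Euclidean distance. All names and coordinates are hypothetical.

```python
def change_features(points):
    """Successive in-plane displacements along a trajectory, flattened."""
    feats = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        feats.extend([x1 - x0, y1 - y0])
    return feats


def classify(points, training):
    """Label a trajectory by its nearest training trajectory (1-NN, assumed).

    training is a list of (label, points) pairs observed on the same
    acquisition dates as the trajectory being classified.
    """
    target = change_features(points)

    def squared_distance(item):
        other = change_features(item[1])
        return sum((a - b) ** 2 for a, b in zip(target, other))

    return min(training, key=squared_distance)[0]


# Hypothetical in-plane trajectories over four acquisitions.
training = [
    ("wheat", [(50, 10), (45, 25), (55, 30), (70, 12)]),
    ("nonwheat", [(50, 10), (52, 11), (55, 14), (58, 20)]),
]
# A field offset from the wheat example but changing in a similar pattern.
test_field = [(60, 15), (56, 29), (65, 35), (81, 16)]
label = classify(test_field, training)
```

Using displacements rather than positions makes the features insensitive to a constant offset of the whole trajectory, which is one simple way to emphasize the pattern of change over the location of the data.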

Simple interpretations based on crop phenology can be associated with this 
pattern. It has been proposed [7], for example, that in our coordinate frame on
the plane of the data, the abscissa and ordinate give measures of brightness 
and greenness, respectively, of the ground cover. Interpretation of the tem- 
poral trajectories of winter wheat in terms of the anticipated phenomenological 
changes generally supports this view, though this issue appears far from 
resolved. 

Note that the trajectories associated with wheat fields at the two sites
have a certain qualitative similarity. Both sets are sampled versions of a con-
tinuous trajectory which appears to resemble an l (lower case script E). The
sampling times at the Oklahoma site (Figure 4b), however, are such as to miss
the distinguishing feature of the 'loop'. Distortion in trajectories can be 
introduced by atmospheric conditions, such as haze. The nature of this distor- 
tion, however, being common to data from all classes, would generally not mask 
the class-specific features. Such identification of the features of the
temporal trajectory with the crop phenology and the atmospheric conditions
would permit development of unsupervised crop-calendar tracking classification 
schemes.


3. CONCLUSIONS 

Graphical representation of the LANDSAT MSS data acquired during the dif- 
ferent phases of the wheat crop has shown that wheat can be identified on the 
basis of its characteristic pattern of temporal changes in the spectral re- 
sponse. This pattern can be interpreted in terms of the crop calendar and the 
crop vigor. Features of this temporal pattern provide a basis for both super- 
vised and unsupervised classification of the data. These results, presented 
here in the context of wheat identification, are applicable to the general crop 
classification problem. 
APPENDIX


Orthonormal Transformation for the MSS Data 


The following orthonormal transformation was used for the graphical representa- 
tion of data in Figures 2-4. 


T =   0.406   0.600   0.645   0.243
     -0.386  -0.530   0.535   0.532
      0.723  -0.597   0.206  -0.278
      0.404  -0.039  -0.505   0.762


The rows of T define the basis vectors of the new coordinate frame. The first
two rows span a subspace parallel to the plane of the data. 
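
As a consistency check on the transformation, the following sketch measures how far the product of T with its transpose departs from the identity matrix. The entries are the sixteen values listed in this Appendix; placing each extracted group of four values in a column of T is an assumption about the original layout. The function name is hypothetical.

```python
# Appendix values, with each extracted group of four taken as a column of T.
T = [
    [0.406, 0.600, 0.645, 0.243],
    [-0.386, -0.530, 0.535, 0.532],
    [0.723, -0.597, 0.206, -0.278],
    [0.404, -0.039, -0.505, 0.762],
]


def max_deviation_from_identity(matrix):
    """Largest absolute entry of (matrix * matrix^T - I)."""
    worst = 0.0
    for i, row_i in enumerate(matrix):
        for j, row_j in enumerate(matrix):
            dot = sum(a * b for a, b in zip(row_i, row_j))
            expected = 1.0 if i == j else 0.0
            worst = max(worst, abs(dot - expected))
    return worst


deviation = max_deviation_from_identity(T)
```

The deviation is on the order of 0.001, which is consistent with an orthonormal matrix rounded to three decimals.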
Figure 2a. Scatter Plot of the In-plane Components of Data
Figure 2b. Scatter Plot of the Off-plane Components of Data